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1. Introduction 

Compressed sensing [T1I2] is an area of signal recovery which has recently attracted a great 
deal of attention thanks to its large potential for applications; it enables reconstruction of 
sparse signals xq G with far fewer linear measurements than traditionally required. 
In this note we investigate what happens to reconstruction quality when theoretical 
conditions on the measurement matrix are relaxed and replaced by (less favorable) 
conditions that are encountered in realistic inverse problems. 

When a signal Xq is known to be sparse (i.e. has few nonzero components), a 
practical method to obtain accurate reconstruction results from measurements Kxq is 
the ii minimization technique [Il|3lll]: 

X = arg min (1) 

Kx=Kxo 

{K is a known m x n operator). ||x||i denotes the £i-norm of a vector: = 
Under certain conditions it can be shown that the ii reconstruction x equals the 
unknown sparse input signal Xq exactly [5]. Data obtained from real measurements 
are more often than not corrupted by significant quantities of noise; this means that 
Kxq remains unknown and instead y = Kxq + noise is acquired. In this case, the ill- 
posed problem can be regularized by adding a sparsity promoting £i-norm penalty to 
a quadratic misfit functional (assuming Gaussian noise) [Ej. In other words, the input 
signal can then be recovered from the noisy data y by minimizing the convex functional 

x{\) = argmin \\Kx - y\\'^ + 2A||x||i (2) 

X 

for a suitable choice of the penalty parameter A. || ■ || stands for the £2-norm: 
II a II ^ = |ajp. Contrary to the noiseless case, the recovered signal ([2]) will not be 
exactly equal to the input signal xq. 

The potential of the ii minimization ([1]) and ii penalization method ([2]) for the 
reconstruction of sparse signals has already been assessed extensively, both from a 
theoretical point of view and with numerical simulations [714T2]. Nonetheless, most 
research focuses almost exclusively on matrices which exhibit mutually incoherent 
columns or matrices which (likely) satisfy the 'Restricted Isometry Property' (RIP) |13j . 
such as e.g. random matrices, structured random matrices (e.g. random rows of a 
Fourier matrix) or matrices composed of the union of certain bases. In most practical 
situations it is difficult to verify if a matrix satisfies the RIP. Here we set out to study 
the quality of reconstruction when the measurement matrix K does not fit this ideal 
category. One may e.g. have good a priori control on the sparsity of the desired model 
(by the choice of a suitable basis), but one may not have sufficient control on the physical 
measurements to make the matrix satisfy the RIP. Indeed, a single matrix column with 
small or zero norm will destroy the RIP, but attempting sparse recovery may still be 
appropriate or worthwhile in practice. 

In this paper, we investigate the infiuence of the noise level in the data and of the 
singular value spectrum of the measurement matrix K on the ability of the £i method 
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([2]) to faithfully reconstruct a sparse input signal Xq. We also discuss the behavior of 
the recovery as a function of the sparsity of the input signal and as a function of the 
indeterminacy of the system (number of rows with respect to number of columns in K); 
we provide practical predictions on the relative error of the recovered sparse signal with 
respect to the sparse ground truth model. In order to accomplish this, a large number 
of numerical simulations is performed and the results are displayed using the diagrams 
introduced in [9l[Tl] . We also provide an illustration of the practical use of the method 
in a magnetic tomography setting. 

2. Assessment method 

We consider an input signal xq G M" with k non-zero coefficients, and a data vector 
y G M"*, with m < n, such that y = Kxq + t] for a m x n matrix K and a noise vector 
T]. We want to assess the effectiveness of the ii penalization method (jj]) for recovering 
Xq, using the knowledge of K and of y, as a function of the spectrum of the matrix K, 
of the noise level e = ||?7||/||-R'xo||, of the number of data m and of the sparsity k of xq- 

The success of the £i method (|2]) for the recovery of a sparse signal Xq will depend 
on the indeterminacy of the linear system and on the sparsity of the input signal. A 
concise graphical representation of the success rate of £i-based compressed sensing was 
introduced in [9l[Tl]. Likewise, we will use the parameters 6 = m/n and p = k/m (not 
k/n) and perform a number of experiments for various values of 6 and p. More precisely, 
we will use a cartesian grid in the 6 — p-plane and, for each grid point, set up input 
data, a matrix K and a noise vector 77, determine the minimizer x(A) and compare 
it to Xq. This experiment is repeated several times over, for different K, xq and 77, 
but for fixed spectra and values of e. Afterwards the whole experiment is repeated for 
different spectra and different values of e. Do note that the 6 — p-plane does not give 
a uniform description of all possible inverse problems. It is specifically tailored towards 
the evaluation of sparse recovery. 

In this work we use 40 equidistant points for 6 and p with values ranging from 
0.025 to 1. The matrices K have a fixed number of columns {n = 800). The experiment 
is repeated 100 times in each grid point, for each spectrum and each value of e. The 
same experiments with a larger number of columns allow us to conclude that the result 
presented in this paper do not depend on this quantity. 

In case of noiseless data, the success rate of the recovery strategy can be measured 
by simply computing the proportion of successful reconstructions (x = xq). In the case 
of noisy data, one will never have perfect reconstruction {x{X) = Xq), and therefore 'good 
recovery' needs to be defined in a different way. We measure the success of recovery 
by calculating the mean of the relative reconstruction error e = ||a;(A) — xo||/||xo|| over 
several trials. It is this number that will be plotted as a function of 5 = m/n and 
p = k/m. 

When assessing the influence of the spectral properties of K on the success of the 
method ((21), we take into account the change in spectrum of a matrix by the addition of 
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Figure 1. Mean over 100 examples of the normalized spectra of matrices (see section 
[2|) of the following type: random (solid), type 2 (dashed) and type 3 (dotted), k = 10^. 
The matrices are of size 200 x 800 (a), 400 x 800 (b) and 800 x 800 (c). 



extra measurements (rows). We therefore choose the mxn matrices K as submatrices of 
nxn matrices in the following way: We draw an nxn matrix containing random numbers 
from the Gaussian distribution with zero mean and unit variance. Three different types 
of spectral behavior are then considered. "Type 1" is obtained by taking the random 
matrix itself. For the other types, we first calculate the singular value decomposition 
and replace the singular values by 

s^ = s,K^^-iy(^-^) (3) 

(with si ^ 0) for "type 2", and by 

s^ = s,^('-''y(-'-') (4) 

for "type 3". Several values of k will be chosen in the next section. The singular 
vectors remain untouched. For all three types, the mxn matrices K are then found by 
randomly selecting m different rows from these nxn matrices. As seen in figure [H the 
spectrum changes with m. We believe this corresponds to the behavior encountered in 
real problems. 

The position of the k non-zero entries of the input vector xq, as well as their values, 
were randomly generated from a uniform distribution. 

For a given matrix K and a given sparse input signal xq, synthetic data are 
constructed by setting y = Kxq + t] where 77 is a m x 1 vector with entries taken 
from a Gaussian distribution with zero mean. The noise level is determined by the 
parameter e = ||r/||/||/s:xo||. We will choose e = 0.02,0.05,0.10,0.20 and 0.50. 

The penalty parameter A in (j2]) is chosen to satisfy Morozov's discrepancy principle: 
||/rx(A) — y\\ = \\r]\\. In other words, we will fit the data up to the level of the noise. 
In practice, and for these problems sizes, this can be achieved easily by using the (non- 
iterative) Homotopy/LARS/lasso method [T514T7] for the solution of ([2]). 
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Figure 2. Mean, over 100 trials, of the relative error e ~ ||a;(A)— a;o||/||.To|| as a function 
of 5 and p for a noise level of 10%. (a) Gaussian random matrices, (b) matrices of type 
2 and (c) matrices of type 3 with k — W^. 



3. Results 

In order to assess the effect of tlie different variables on the accuracy of the solution of 
([2]), the mean of the reconstruction error e is plotted in the 6 — p-plane for the three 
classes of matrices described in section [2] with k = 10^ and noise level of 10% (see 
figure [2]). The quality of the ii penalization reconstruction depends strongly on the 
spectrum of the matrix K. In particular we see that good results for random matrices 
must be considered as a best case scenario. The results are less favorable for the two 
other types of matrices. 

Secondly, in order to combine results for different spectra of K in one plot, the 
behavior of the mean relative error e2e = 2e is studied. In figure [3] (a) and (b), the 
relative accuracy is plotted for various matrices K and e = 2% and 10% respectively. 
Only very sparse vectors Xq are shown in these plots (0.025 < p < 0.2) as the results 
deteriorate rapidly for larger p. These two e2e curves in the 5 — p-plane (for the noise 
levels of 2% and 10%) are quite similar in character. We also simulated noise levels of 
5%, 20% and 50% and observed the same features. 

The same sparse setting as figure 3(a,b) was used in figure 3 (c) and (d) where we show 
the reconstruction errors e for matrices of type 2 with k = 10^ and k = 10^^ respectively, 
and e = 0.1. 

It is not surprising that the above results depend on the spectrum of K. Any change 
in the spectrum of a matrix K also induces a change in the condition number of the 
column submatrices of K. As the restricted isometry property (RIP) [I3] implies that 
these submatrices are well conditioned, our setting makes it more difficult /impossible 
for a matrix K to satisfy a RIP. As argued below, even when the RIP is not satisfied, 
a sparse recovery may still be feasible. 

In [18] it is stated that when 

(1 - S2k)\\z\\ < \\Kz\\ < (1 + 62k)\\z\\, Vz : ll^llo < 2k, (5) 

and (52fc < 2/(2 + /f/i) (|| 2 II is the number of non-zero coefficients of z), any vector xq, 
with ||xo||o < k can be recovered by ii minimization with an error ||x — xqH < c||?7||. The 




Figure 3. (a) and (b): Behavior in the 6 — p-plane of the mean relative error e2e 
corresponding to two times the noise level e with respectively e = 2% and 10%. The 
reconstructions were done for the following matrices: Gaussian random (dotted), type 
2 (solid) and type 3 (dashed). The value of k is displayed on the curve. Panels (c) and 
(d): Mean relative errors on reconstructions for matrices of type 2 with respectively 
K = 10^ and K = 10^^ and e = 0.1. The stars indicate the position in the S — p-plane 
of the input model of an inverse problem in magnetic tomography described in section 

H 



constant c only depends on 62k and 77 is the noise vector as defined in section [2J This 
condition imphes that the condition numbers {K2k) of all the 2k column submatrices of 
K are bounded from above by: 

^ 1.6498, (6) 

+ V7 

As an illustration, we consider a matrix K (of type 1, 2, 3 resp.) of size 200 x 800 
{6 = 0.25) and choose 10000 random 20-column submatrices of K. The average 
condition number (^20) of those submatrices is calculated: (^20) ~ 1-8 for type 1 
matrices, (^20) ~ 2.7 and (^20) ~ 2.0 for matrices of type 2 and 3 respectively with 
K, = 10^. We see that (^20) lies above the upper bound ([6]). This means that for 
the three types of matrices studied here, there exists some column submatrices with 
20 columns, which do not satisfy condition (|6]). The matrix K (even normalized) will 
therefore not satisfy ([S]) for /c = 10 (which corresponds to p = 0.05) and therefore [TS] 
does not guarantee the accuracy of the reconstruction of a vector with 10 (or more) 
nonzero coefficients. However, as seen on figure |2] and [HI the £1 penalization performs 
reasonably well, even when the vector to be reconstructed is less sparse than p = 0.05. 
We can therefore conclude that the bounds obtained from the RIP theory are quite 
restrictive. 

The same conclusion can be drawn for the results obtained in [10], where the 
behavior of the solution of ([2]) with respect to the mutual coherence of the matrix 
K is studied. Inspired by their analytical results, the authors only studied input signals 
with a very small (maximum) number of nonzero coefficients {k = 3) for a matrix of 
size 128 X 256. This matrix is a concatenation of the identity matrix and the Hadamard 
matrix, with the columns normalized to unit £2 norm. For that case we have calculated 
the value of the mean of the condition number (^2^) over 10^ submatrices with 12 
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columns of K (i.e. k = 6 and p = 0.0469) and found 1.4216, with a maximum of 1.7495. 
From the present simulations we therefore conclude that the ii penalization method can 
still yield accurate results for less sparse solutions than imposed by the bound in [TO] . 

Most of the vectors Xq studied here are less sparse than the bounds found in the 
theoretical studies mentioned above. Therefore, our simulations give an indication of 
the accuracy of the results obtainable in practice with ii penalization for various types 
of matrices. They are of practical use to predict the relative reconstruction error in 
realistic inverse problem settings. 

4. Illustration 

We illustrate the previous results on the predicted accuracy of sparse recovery by ii 
penalization with the help of an inverse problem in magnetic tomography. Our toy 
problem aims to reconstruct a 2D current distribution J on part of a sphere from 
measurements of the (normal component of the) magnetic field above this surface. The 
current distribution J and the data B are assumed to be linked by the formula [T9l|20] : 

m = ^ [ /(Ox^^dy. (7) 

Att Jy \r — r\-^ 

We also assume that the unknown currency distribution J is divergence- free: div J = 0. 

In our toy problem the domain V is part of a thin spherical shell 0.089 < r < 0.090 
(all units are SI units) which we parametrize by coordinates ^ and 77: 

a: = -tan,^, i/ = -tan?7, z=- and s = v 1 + tan^ ,^ + tan^ 77 (8) 

s s s 

(with — 7r/3 < ^,r] < vr/3). The coordinate lines of this parametrization are angularly 
equidistant great circles |21]. The shell covers just over one quarter of the whole sphere. 
Furthermore we assume that the current distribution has no radial component. The 
divergence-free angular current distribution J(^, rj) is then parametrized by the field 
F(^,?7): J = curl x ^(^,77)!^ (the dependence on r is not taken into account here). 

The domain V is divided into 64^ voxels and we choose an input model -F'°(^, rj) 
that is sparse in the CDF 4-2 wavelet basis [22]. The input model has only 60 nonzero 
coefficients in this basis (out of a possible 4096). Formula ([7j) is used to calculate the 
(normal component of the) magnetic field in 1000 randomly distributed points above the 
patch — 7r/3 < ^,?7 < n/3 at r = 0.1 (one centimeter above the patch). 10% Gaussian 
noise is added to these data. On this 64 x 64 grid we therefore have 1000 data versus 
4096 unknowns F{^i,rii). 

The singular value spectrum of this 1000 x 4096 matrix (in the wavelet basis) is 
plotted in figurelHwhere it is compared to the spectrum of the matrices used in sections[2] 
andO Its spectrum is quite similar to the spectrum of a matrix of type 2 with k = 10^^. 
Although the spectra are alike, it is important to point out that some of the (wavelet) 
basis functions lie close to the null space of the matrix. In other words, the norms of the 
columns of the toy problem matrix are quite different from those of the matrices used 
in the setting of the preceding sections (see figure H]) . The average reconstruction errors 
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Figure 4. Left: The spectrum of the matrix used in the toy magnetic tomography 
experiment. Center: The singular value spectra of matrices of type 2 (solid) with 
K = 10^ (top) and n = 10^^ (bottom) and a rescaled spectrum of the matrix of the toy 
problem (dotted). Right: A histogram of columns norms of the toy problem matrix 
(blue) and the matrix of (b) with k = 10^'^ (grey) . They are both normalized by their 
respective matrix spectral norms. 



found in section |3] should therefore be understood as a lower bound: If one or more of 
the basis functions (used to express the sparsity of the model) lies more or less in the 
null space of the operator, the reconstruction quality may be further reduced. 

In our simulation we attempt to reconstruct the input model using an ii 
penalization approach (to exploit the sparsity of the model in the wavelet basis), and a 
traditional £2 penalty approach for comparison. For the former we use the so-called 'Fast 
iterative soft-thresholding algorithm' [23], for the latter we use the conjugate gradient 
method (the variational equations are linear in that case). The ii reconstruction of the 
field F{^,T]) has a relative reconstruction error of 31%, the generic £2 method of 87%. 
Comparing this to the results on figure [3] we see that, for 6 = 1000/4096 = 0.2441 and 
p = 60/1000 = 0.06, the predicted relative reconstruction error also lies around 30% 
for the £1 penalty method. A similar experiment for the £2 penalty method predicts an 
error around 95%. The input model and the two reconstructions are shown in figure |5l 

5. Conclusion 

Sparse recovery can be a very powerful tool for inverse problems especially when 
(structured) random matrices or other matrices that satisfy the Restricted Isometry 
Property (RIP) are concerned. However these cases constitute a theoretical best- 
case scenario. In this note we investigate the reconstruction performance when such 
conditions are loosened and this 'compressed sensing' framework is abandoned. Indeed, 
most matrices discussed here do not satisfy the RIP for submatrices of non-negligible 
size, but reasonably good reconstruction results can still be obtained. 

By means of extensive numerical simulations we assessed the performance of the £1 
recovery method (j2]) for realistic, non-idealized, linear systems with noisy data. Various 
levels of sparsity of the model, different numbers of data, and different noise levels 
and spectral behaviors of the measurement matrices were studied. It was shown that 
the mean relative reconstruction error e grows sharply when the framework of random 
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Figure 5. The current density model J used in the synthetic sparse magnetic 
tomography experiment. Arrows represent the field J whereas color indicates the 
field F. Left: the input model. Center: the ii penalized reconstruction. Right: the £2 
penalized reconstruction. The £1 reconstruction is less noisy and its amplitude is more 
faithful to the input model than the £2 reconstruction. 



matrices is abandoned. This means that accurate reconstruction results can only be 
achieved for problems with a more sparse solution. Nonetheless the known (RIP) 
bounds from [18] seem too restrictive in practice. These bounds guarantee an accurate 
reconstruction for all vectors that are sufficiently sparse. But, although loosening this 
condition will sometimes provide bad reconstructions, we showed that it is often possible 
to obtain reconstructions of acceptable quality of less sparse vectors. 

Figures [2] and [3] allow the reader to estimate the relative reconstruction error based 
on the shape of the measurement matrix (number of rows and columns), on the expected 
sparsity of the model and on the spectral properties of the measurement matrix. These 
predicted errors unfortunately are not an upper bound but a lower bound, in the 
sense that the simulations did not take into account all factors that could potentially 
negatively influence ii penalized reconstructions. Other factors that play an important 
role on the quality of reconstructions are e.g. the distribution of the matrix column 
norms (i.e. whether basis vectors lie in the null space of the matrix, in contradiction 
to the RIP), and the presence of (nearly) identical columns. In the latter case, the 
ii penalty will not impose a unique minimizer to functional (E]) and such a penalty is 
therefore from the outset not a suitable regularization method for that inverse problem. 

We also conclude that the ii method responds well to an increased amount of 
noise: When the noise level increases, less sparse vectors can be recovered to the same 
relative accuracy e2e- The magnetic tomography example shows the practical usefulness 
of the numerical study, and there is good agreement between that toy problem and the 
numerical simulations. 
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